Distributed Search for Structured Documents
Authors
Abstract
The quality of search performed by a search engine can be improved by letting the user specify conditions on the information structure associated with document content. Some centralised search engines have implemented structure-aware search. However, in a distributed heterogeneous search system, structured search is more difficult. The difficulties are associated with the different structural organisation of documents in different databases. This complicates presentation of query forms to the user and the adaptation of queries for propagation to different target systems. Selection of targets for query propagation also needs to be structure-aware. This paper presents an approach taken in the distributed search system being developed by the authors.
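The abstract names two structure-aware steps: choosing which targets should receive a query, and adapting the query to each target's own structural organisation. The abstract does not describe how the paper does either, so the following is only a minimal illustrative sketch under stated assumptions. The `Target` class, the `field_map` idea of mapping common field names to local ones, and all database and field names are hypothetical and are not taken from the paper.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Target:
    """A hypothetical remote database (not from the paper)."""
    name: str
    # Assumption: each target publishes a mapping from common,
    # user-facing field names to its own local field names.
    field_map: dict[str, str]


def select_targets(query: dict[str, str], targets: list[Target]) -> list[Target]:
    """Structure-aware selection: keep only targets that can express
    every field the query places a condition on."""
    return [t for t in targets if all(field in t.field_map for field in query)]


def adapt_query(query: dict[str, str], target: Target) -> dict[str, str]:
    """Rewrite common field names into the target's local field names."""
    return {target.field_map[field]: value for field, value in query.items()}


# Hypothetical targets with different structural organisation.
targets = [
    Target("library_a", {"title": "dc_title", "author": "dc_creator"}),
    Target("archive_b", {"title": "headline"}),
]

# A structured query expressed in the common vocabulary.
query = {"title": "distributed search", "author": "smith"}

# Propagation plan: one adapted query per selected target.
plan = {t.name: adapt_query(query, t) for t in select_targets(query, targets)}
print(plan)
```

In this sketch a target that cannot express one of the queried fields is simply skipped. A real system might instead relax the query or fall back to unstructured search for that target; that design choice is outside what the abstract states.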
Similar resources
Indexing for Vertical Search Engine: Cost Sensitive
The information on the WWW is growing exponentially, and dynamic, unstructured and structured data must be located as useful resources among an enormous quantity of web pages and online databases. In this paper we propose a novel domain-specific indexing technique for downloading hidden web pages. This technique keeps the related documents in the same domain so that searching o...
Effective Structured Query Formulation for Session Search
In this work, we emphasize formulating effective structured queries for session search. For a given query, phrase-like text nuggets are identified and formulated into Lemur queries to feed into the Lemur search engine. Nuggets are substrings in qn, similar to phrases but not necessarily as semantically coherent as phrases. We assume that a valid nugget appears frequently in top returned snip...
Cooperating Peers for Content-Oriented XML-Retrieval
Semi-structured documents formatted with the Extensible Markup Language (XML) are gaining wide use in a whole range of applications including E-Commerce, E-Business, E-Science, Digital Libraries (DL), File Sharing, and in recent years especially in applications for Peer-to-Peer (P2P) systems. P2P architectures have been identified as an efficient means of ad-hoc collaboration and information s...
The quality of probabilistic search in unstructured distributed information retrieval systems
Searching the web is critical to the Web’s success. However, the frequency of searches together with the size of the index prohibit a single computer from coping with the computational load. Consequently, a variety of distributed architectures have been proposed. Commercial search engines such as Google usually use an architecture where the index is distributed but centrally managed...
Structured and Distributed Cooperative Editing in a Large Scale Network
In this chapter we discuss the advantages of a structured model of documents in a cooperative editor. The discussion is based on the experience gained in developing and using Alliance, a groupware application that allows several users distributed on a network to cooperate for producing documents in a structured way. In addition to the local editing functions made available on each site by a str...